ML Primers

Artificial Neural Networks: Matrix Form (Part 5)

To actually implement a multilayer perceptron learning algorithm, we do not want to hard-code the update rules for each weight. Instead, we can formulate both feedforward propagation and backpropagation as a series of matrix multiplications, which makes the implementation much simpler and more general. This formulation is also what leads to the impressive performance of neural nets - offloading matrix multiplications to a graphics card allows for massive parallelization and training on large amounts of data.
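As a small taste of this formulation, here is a minimal NumPy sketch of a forward pass (the layer sizes, random seed, and sigmoid nonlinearity are my illustrative choices, not necessarily the post's exact setup):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 3 inputs -> 4 hidden units -> 2 outputs.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # hidden-layer weights
W2 = rng.standard_normal((2, 4))   # output-layer weights

def feedforward(x):
    """One forward pass: each layer is a single matrix multiply."""
    a1 = sigmoid(W1 @ x)     # hidden activations
    return sigmoid(W2 @ a1)  # output activations

print(feedforward(np.array([0.5, -1.0, 2.0])))
```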

This tutorial will cover how to build a neural network using this matrix formulation. It builds heavily on the theory in Part 4 of this series, so make sure you understand the math there!

Read More

Artificial Neural Networks: Mathematics of Backpropagation (Part 4)

Up until now, we haven't utilized any of the expressive non-linear power of neural networks - all of our simple one-layer models corresponded to a linear model such as multinomial logistic regression. These one-layer models had a simple derivative: we only had one set of weights, which fed directly into our output, so it was easy to compute the derivative with respect to those weights. However, what happens when we want to use a deeper model? What happens when we start stacking layers?
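To preview why depth changes things: once a hidden layer sits between a weight and the output, the error depends on that weight only through a chain of activations. In a sketch (with illustrative notation, not necessarily the symbols used in the post), for a first-layer weight $w^{(1)}_{ij}$, the chain rule gives

$$\frac{\partial E}{\partial w^{(1)}_{ij}} = \frac{\partial E}{\partial a^{(2)}} \, \frac{\partial a^{(2)}}{\partial a^{(1)}_j} \, \frac{\partial a^{(1)}_j}{\partial w^{(1)}_{ij}},$$

and backpropagation is the bookkeeping that evaluates these chains efficiently for every weight at once.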

Read More

Artificial Neural Networks: Linear Multiclass Classification (Part 3)

In the last section, we went over how to use a linear neural network to perform classification. We covered using both the perceptron algorithm and gradient descent with a sigmoid activation function to learn the placement of the decision boundary in our feature space. However, we only covered binary classification. What if we instead want to classify a point as belonging to one of $K$ classes?
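As a preview, one standard way to score $K$ classes is a softmax over $K$ linear outputs. Here is a minimal NumPy sketch (the weights and data are my illustrative choices; the post develops the approach in full):

```python
import numpy as np

def softmax(scores):
    """Turn K raw scores into a probability distribution over K classes."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)

# Illustrative: one row of weights per class (K = 3 classes, 4 features).
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
x = np.array([1.0, -0.5, 2.0, 0.3])

probs = softmax(W @ x)
print(probs, probs.argmax())  # class probabilities and the predicted class
```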

Read More

Artificial Neural Networks: Linear Classification (Part 2)

So far we've covered using neural networks to perform linear regression. What if we want to perform classification using a single-layer network? In this post, I will cover two methods: the perceptron algorithm and using a sigmoid activation function to generate a likelihood. I will not cover the delta rule because it is a special case of the more general backpropagation algorithm, which will be covered in detail in Part 4.
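For a flavor of the first method, here is a minimal NumPy sketch of the classic perceptron update (the learning rate, epoch count, and toy data are my illustrative choices):

```python
import numpy as np

def perceptron_train(X, y, epochs=10, lr=1.0):
    """Classic perceptron rule: nudge w toward misclassified points.
    X: (n_samples, n_features) including a bias column; y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:   # misclassified (or on the boundary)
                w += lr * yi * xi    # move the boundary toward the point
    return w

# Tiny linearly separable example (bias term as a constant 1 feature).
X = np.array([[1, 2.0, 1.0], [1, 1.0, 3.0], [1, -1.5, -1.0], [1, -2.0, 0.5]])
y = np.array([1, 1, -1, -1])
w = perceptron_train(X, y)
print(np.sign(X @ w))  # should reproduce y
```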

Read More

Artificial Neural Networks: Linear Regression (Part 1)

Artificial neural networks (ANNs) were originally devised in the mid-20th century as a computational model of the human brain. Their use waned because of the limited computational power available at the time and because of some theoretical issues that weren't solved until several decades later (which I will detail at the end of this post). However, they have experienced a resurgence with the recent interest and hype surrounding Deep Learning. One of the more famous examples of Deep Learning is the "YouTube Cat" paper by Andrew Ng et al.

It is theorized that, because of their biological inspiration, ANN-based learners will be able to emulate how a human learns to recognize concepts or objects without the time-consuming feature-engineering step. Whether or not this is true (or even provides an advantage in terms of development time) remains to be seen, but for now it's important that we, as machine learning researchers and enthusiasts, be familiar with the basic concepts of neural networks.

This post covers the basics of ANNs, namely single-layer networks. We will cover three applications: linear regression, two-class classification using the perceptron algorithm, and multi-class classification.
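As a small taste of the first application, a single linear unit can be fit by gradient descent on squared error. This NumPy sketch uses synthetic data and a learning rate of my own choosing; the post derives the update rule properly:

```python
import numpy as np

# A single linear unit trained with gradient descent on squared error.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
true_w = np.array([3.0, -1.5])
y = X @ true_w + 0.1 * rng.standard_normal(100)  # noisy linear targets

w = np.zeros(2)
lr = 0.1
for _ in range(200):
    grad = X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
    w -= lr * grad

print(w)  # should be close to [3.0, -1.5]
```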

Read More

ML Primers

During my first year of studying machine learning, I've read a lot of papers, book chapters, and online tutorials. I like to learn by example. For me, theory paired with an implementation is the best way to learn a topic in machine learning. However, nearly every tutorial I've come across has a lot of one and little of the other. The ones that include both are usually presented at such a high level that they are of little use to someone trying to fully understand the topic at hand, or their code isn't even public!

Over the past few years, I've collated a lot of material from different sources to create a set of "primers" that serve as introductions to a wide array of machine learning topics. Each pairs a theoretical foundation with a short tutorial and accompanying code written in Python. After going through one of these primers, you should have enough basic knowledge to begin reading papers on a particular subject without feeling too lost.

I'm currently in the process of posting these primers. Most of my original code was written in MATLAB, but it's not free and not everyone has access to it through their university or workplace, so I'm translating it to Python. The primers are aimed at an audience familiar with calculus and computer science so as not to "dumb down" any material, but I try to avoid using undefined terms or concepts.

I hope you find them helpful! If you have any issues with the code or material, feel free to leave a comment.